A New DOP Model for Phrase-structure Parsing of Persian Sentences
Abstract
In this paper we apply a recent approach to Data-Oriented Parsing (DOP), named Double-DOP, to Persian sentences. Like other DOP models, the Double-DOP parser uses syntactic fragments of arbitrary size from a treebank to analyse new sentences, but it extracts a restricted yet representative subset of fragments: only those that are encountered at least twice. The accuracy of Double-DOP is well within the range of state-of-the-art parsers currently used in other NLP tasks, while offering the additional benefits of a simple generative probability model and an explicit representation of grammatical constructions. To date there is no standard parser for the Persian language, and this work attempts to apply the Double-DOP method to parsing Persian sentences.
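The fragment selection described in the abstract can be illustrated with a small brute-force sketch. It is not the extraction algorithm used by the authors; it only shows the idea of keeping fragments that are encountered at least twice. The tuple-based tree encoding, the function names (`fragments_at`, `all_nodes`, `recurring_fragments`) and the English toy treebank are all assumptions made for this illustration, and full enumeration is exponential in tree size, so it is only usable on very small trees.

```python
from collections import Counter
from itertools import product

# Assumed encoding: a node is a tuple (label, child, ...); a word is a plain
# string; a frontier (unexpanded) nonterminal is the one-tuple (label,).

def fragments_at(node):
    """Return every fragment rooted at `node`, with the root itself expanded."""
    label, children = node[0], node[1:]
    options = []
    for child in children:
        if isinstance(child, str):
            # A word is always kept when its parent is expanded.
            options.append([child])
        else:
            # Either cut here (frontier nonterminal) or expand further.
            options.append([(child[0],)] + fragments_at(child))
    return [(label, *combo) for combo in product(*options)]

def all_nodes(node):
    """Yield every nonterminal node of a tree, top-down."""
    yield node
    for child in node[1:]:
        if not isinstance(child, str):
            yield from all_nodes(child)

def recurring_fragments(treebank, min_count=2):
    """Count all fragments and keep those seen at least `min_count` times."""
    counts = Counter(
        frag
        for tree in treebank
        for node in all_nodes(tree)
        for frag in fragments_at(node)
    )
    return {frag: n for frag, n in counts.items() if n >= min_count}

# Hypothetical two-tree treebank that differs only in the verb.
toy_treebank = [
    ("S", ("NP", "she"), ("VP", ("V", "saw"), ("NP", "him"))),
    ("S", ("NP", "she"), ("VP", ("V", "heard"), ("NP", "him"))),
]

recurring = recurring_fragments(toy_treebank)
print(len(recurring))  # 10 of the 17 fragments per tree are shared
```

In this toy treebank every fragment that contains the verb occurs only once and is discarded, while fragments that leave the verb slot open (for example an S with an open VP) are kept.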
Similar resources
Feature Engineering in Persian Dependency Parser
A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...
A simple DOP model for constituency parsing of Italian sentences
We present a simplified Data-Oriented Parsing (DOP) formalism for learning the constituency structure of Italian sentences. In our approach we try to simplify the original DOP methodology by constraining the number and type of fragments we extract from the training corpus. We provide some examples of the types of constructions that occur more often in the treebank, and quantify the performance ...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing of natural language that automatically analyzes the dependency structure of sentences and produces a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline mode. Unfortunately, in pipel...
A DOP Model for Phrase-Structure Trees
This chapter gives an instantiation of DOP (Scha 1990, 1992; Bod 1992) which is known as Tree-DOP or DOP1 and which will be extensively used in the rest of this book. Tree-DOP combines subtrees from a treebank to parse new sentences. It employs the relative frequency estimator to assign probabilities to subtrees, and computes the probability of a parse tree as the sum of the probabilities of it...
A Data-Oriented Parsing Model for HPSG
Data Oriented Parsing (DOP) is based on the idea of processing new input by combining fragments (associated with some probabilities) that are extracted from a treebank. In the simplest case these fragments are subparts of simple phrase structure trees (Tree-DOP). The approach is attractive in many ways but the impoverished representational basis is a serious drawback from a linguistic point of ...